Abstract. Let M be an arbitrary Hermitian matrix of order n, and k
be a positive integer $\le n$. We show that if k is large, the distribution of
eigenvalues on the real line is almost the same for almost all principal
submatrices of M of order k. The proof uses results about random walks
on symmetric groups and concentration of measure. In a similar way,
we also show that almost all k × n submatrices of M have almost the
same distribution of singular values.

Let M be a square matrix of order n. For any two sets of integers
$i_1, \dots, i_k$ and $j_1, \dots, j_l$ between 1 and n,
$M(i_1, \dots, i_k; j_1, \dots, j_l)$ denotes the submatrix of M formed by
deleting all rows except rows $i_1, \dots, i_k$, and all columns except
columns $j_1, \dots, j_l$. A submatrix like
$M(i_1, \dots, i_k; i_1, \dots, i_k)$ is called a principal submatrix.

For a Hermitian matrix M of order n with eigenvalues
$\lambda_1, \dots, \lambda_n$ (repeated by multiplicities), let $F_M$ denote
the empirical spectral distribution function of M, that is,

$$F_M(x) := \frac{\#\{i : \lambda_i \le x\}}{n}.$$

The following result shows that given $1 \ll k \le n$ and any Hermitian matrix
M of order n, the empirical spectral distribution is almost the same for
almost every principal submatrix of M of order k.

Theorem 1. Take any $1 \le k \le n$ and a Hermitian matrix M of order n.
Let A be a principal submatrix of M chosen uniformly at random from the
set of all k × k principal submatrices of M. Let F be the expected spectral
distribution function of A, that is, $F(x) = \mathbb{E}F_A(x)$. Then for each
$r \ge 0$,

$$\mathbb{P}(\|F_A - F\|_\infty \ge k^{-1/2} + r) \le 12\sqrt{k}\,e^{-r\sqrt{k/8}}.$$

Consequently, we have

$$\mathbb{E}\|F_A - F\|_\infty \le \frac{13 + \sqrt{8}\log k}{\sqrt{k}}.$$

Exactly the same results hold if A is a k × n submatrix of M chosen uniformly
at random, and $F_A$ is the empirical distribution function of the singular
values of A. Moreover, in this case M need not be Hermitian.

Remarks. (i) Note that the bounds do not depend at all on the entries of M,
nor on the dimension n.

(ii) We think it is possible to improve the $\log k$ to $\sqrt{\log k}$ using
Theorem 2.1 of Bobkov [2] instead of the spectral gap techniques that we use.
(See also Bobkov and Tetali [3].) However, we do not attempt to make this
small improvement because $\sqrt{\log k}$, too, is unlikely to be optimal.
Taking M to be the matrix which has n/2 1's on the diagonal and zeros
everywhere else, it is easy to see that there is a lower bound of
const.$k^{-1/2}$. We conjecture that the matching upper bound is also true,
that is, there is a universal constant C such that
$\mathbb{E}\|F_A - F\|_\infty \le C k^{-1/2}$.
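
The lower bound in Remark (ii) can be checked by an exact computation. For the diagonal matrix with n/2 ones, every principal submatrix of order k is again diagonal, so $F_A$ is determined by the number H of ones among the k chosen diagonal entries, F equals 1/2 on [0, 1), and $\|F_A - F\|_\infty = |H/k - 1/2|$ with H hypergeometric. The following is a minimal sketch under that reduction; the function name `expected_sup_distance` is illustrative and not from the paper.

```python
from math import comb, log, sqrt


def expected_sup_distance(n: int, k: int) -> float:
    """Exact E||F_A - F||_inf for the diagonal 0/1 matrix with n/2 ones.

    H, the number of ones among the k sampled diagonal entries, is
    hypergeometric, and the sup-distance equals |H/k - 1/2|.
    """
    assert n % 2 == 0 and 1 <= k <= n
    half = n // 2
    total = comb(n, k)
    acc = 0.0
    for h in range(max(0, k - half), min(k, half) + 1):
        prob = comb(half, h) * comb(half, k - h) / total
        acc += abs(h / k - 0.5) * prob
    return acc


n = 10_000
for k in (25, 100, 400):
    d = expected_sup_distance(n, k)
    upper = (13 + sqrt(8) * log(k)) / sqrt(k)  # bound from Theorem 1
    print(f"k={k}: sqrt(k) * E-distance = {d * sqrt(k):.3f}, upper bound = {upper:.3f}")
```

When k is much smaller than n, the rescaled quantity `d * sqrt(k)` stays bounded away from zero, which is the const.$k^{-1/2}$ lower bound; the upper bound of Theorem 1 is larger by the logarithmic factor discussed in the remark.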

(iii) The function F is determined by M and k. If M is a diagonal
matrix, then F is exactly equal to the spectral measure of M, irrespective
of k. However, it is not difficult to see that the spectral measure of M
cannot, in general, be reconstructed from F.

(iv) The result about random k × n submatrices is related to the recent
work of Rudelson and Vershynin [6]. Let us also refer to [6] for an extensive
list of references to the substantial volume of literature on random
submatrices in the computing community. However, most of this literature (and
also [6]) is concerned with the largest eigenvalue and not the bulk spectrum.
On the other hand, the existing techniques are usually applicable only when
M has low rank or low 'effective rank' (meaning that most eigenvalues are
negligible compared to the largest one).

A numerical illustration. The following simple example demonstrates that
the effects of Theorem 1 can kick in even when k is quite small. We took
M to be an n × n matrix for n = 100, with (i, j)th entry $= \min\{i, j\}$. This
is the covariance matrix of a simple random walk up to time n. We chose
k = 20, and picked two k × k principal submatrices A and B of M, uniformly
and independently at random. Figure 1 plots the superimposed empirical
distribution functions of A and B, after excluding the top 4 eigenvalues since
they are too large. The classical Kolmogorov-Smirnov test from statistics
gives a p-value of 0.9999 (and $\|F_A - F_B\|_\infty = 0.1$), indicating that
the two distributions are statistically indistinguishable.
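
An experiment of this kind can be sketched with NumPy as follows. This is only an illustration under stated assumptions: the seed, the helper names `esd_sup_distance` and `random_principal_spectrum`, and the choice to compare the full spectra (rather than dropping the top 4 eigenvalues and running a Kolmogorov-Smirnov test) are ours, so the printed distance need not match the value reported above.

```python
import numpy as np


def esd_sup_distance(a_eigs, b_eigs):
    """Sup-distance between the empirical distribution functions of two samples.

    Both functions are right-continuous step functions, so the supremum is
    attained at one of the jump points.
    """
    grid = np.concatenate([a_eigs, b_eigs])
    fa = np.searchsorted(np.sort(a_eigs), grid, side="right") / len(a_eigs)
    fb = np.searchsorted(np.sort(b_eigs), grid, side="right") / len(b_eigs)
    return float(np.max(np.abs(fa - fb)))


def random_principal_spectrum(m, k, rng):
    """Eigenvalues of a uniformly random k x k principal submatrix of m."""
    idx = rng.choice(m.shape[0], size=k, replace=False)
    return np.linalg.eigvalsh(m[np.ix_(idx, idx)])


n, k = 100, 20
i = np.arange(1, n + 1)
M = np.minimum.outer(i, i).astype(float)  # M[i, j] = min(i, j)

rng = np.random.default_rng(0)  # arbitrary seed, an assumption of this sketch
a = random_principal_spectrum(M, k, rng)
b = random_principal_spectrum(M, k, rng)
dist = esd_sup_distance(a, b)
print(f"sup-distance between the two spectral distributions: {dist:.3f}")
```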

Figure 1. Superimposed empirical distribution functions of
two submatrices of order 20 chosen at random from a deterministic
matrix of order 100.

Markov chains. Let us now quote two results about Markov chains that we
need to prove Theorem 1. Let X be a finite or countable set. Let
$\Pi(x, y) \ge 0$ satisfy

$$\sum_{y \in X} \Pi(x, y) = 1$$

for every $x \in X$. Assume furthermore that there is a symmetric invariant
probability measure $\mu$ on X, that is, $\Pi(x, y)\mu(\{x\})$ is symmetric in
x and y, and $\sum_x \Pi(x, y)\mu(\{x\}) = \mu(\{y\})$ for every $y \in X$. In
other words, $(\Pi, \mu)$ is a reversible Markov chain. For every
$f : X \to \mathbb{R}$, define

$$\mathcal{E}(f, f) = \frac{1}{2} \sum_{x, y \in X} (f(x) - f(y))^2\, \Pi(x, y)\mu(\{x\}).$$

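The spectral gap or the Poincaré constant of the chain $(\Pi, \mu)$ is the
largest $\lambda_1 \ge 0$ such that for all f's,

$$\lambda_1 \mathrm{Var}_\mu(f) \le \mathcal{E}(f, f).$$

Set also

$$|||f|||_\infty^2 = \frac{1}{2} \sup_{x \in X} \sum_{y \in X} (f(x) - f(y))^2\, \Pi(x, y). \tag{1}$$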

The following concentration result is a copy of Theorem 3.3 in [5]. 

Theorem 2 ([5], Theorem 3.3). Let $(\Pi, \mu)$ be a reversible Markov chain on
a finite or countable space X with a spectral gap $\lambda_1 > 0$. Then,
whenever $f : X \to \mathbb{R}$ is a function such that
$|||f|||_\infty \le 1$, we have that f is integrable with respect to $\mu$ and
for every $r \ge 0$,

$$\mu\Bigl(\Bigl\{f \ge \int f\,d\mu + r\Bigr\}\Bigr) \le 3e^{-r\sqrt{\lambda_1}/2}.$$
Let us now specialize to $X = S_n$, the group of all permutations of n
elements. The following transition kernel $\Pi$ generates the 'random
transpositions walk'.

$$\Pi(\pi, \pi') = \begin{cases} 1/n & \text{if } \pi' = \pi, \\ 2/n^2 & \text{if } \pi' = \pi\tau \text{ for some transposition } \tau, \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$

It is not difficult to verify that the uniform distribution $\mu$ on $S_n$ is
the unique invariant measure for this kernel, and the pair $(\Pi, \mu)$
defines a reversible Markov chain.

Theorem 3 (Diaconis & Shahshahani [4], Corollary 4). The spectral gap of
the random transpositions walk on $S_n$ is 2/n.
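
For small n the statement of Theorem 3 can be checked numerically. The sketch below builds the transition matrix of the random transpositions walk on $S_4$ with holding probability 1/n and probability $2/n^2$ for each transposition (the values implied by choosing the two positions independently and uniformly), and reads off the gap as one minus the second largest eigenvalue. The function name `random_transpositions_kernel` is illustrative; the check is a sanity test for one small n, not a proof.

```python
from itertools import permutations

import numpy as np


def random_transpositions_kernel(n):
    """Transition matrix of the random transpositions walk on S_n."""
    perms = list(permutations(range(n)))
    index = {p: a for a, p in enumerate(perms)}
    P = np.zeros((len(perms), len(perms)))
    for p in perms:
        P[index[p], index[p]] = 1.0 / n  # the two chosen positions coincide
        for i in range(n):
            for j in range(i + 1, n):
                q = list(p)
                q[i], q[j] = q[j], q[i]  # p composed with the transposition (i j)
                P[index[p], index[tuple(q)]] = 2.0 / n**2
    return P


n = 4
P = random_transpositions_kernel(n)
# P is symmetric, so the uniform distribution is reversible and the
# spectral gap is 1 minus the second largest eigenvalue.
eigs = np.linalg.eigvalsh(P)
gap = 1.0 - eigs[-2]
print(f"n={n}: spectral gap = {gap:.6f}, 2/n = {2 / n:.6f}")
```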

We are now ready to prove Theorem 1.

Proof of Theorem 1. Let $\pi$ be a uniform random permutation of
$\{1, \dots, n\}$. Let
$A = A(\pi) = M(\pi_1, \dots, \pi_k; \pi_1, \dots, \pi_k)$. Fix a point
$x \in \mathbb{R}$. Let

$$f(\pi) := F_A(x).$$

Let $\Pi$ be the transition kernel for the random transpositions walk defined
in (2), and let $|||\cdot|||_\infty$ be defined as in (1).

Now, by Lemma 2.2 in Bai [1], we know that for any two Hermitian
matrices A and B of order k,

$$\|F_A - F_B\|_\infty \le \frac{\operatorname{rank}(A - B)}{k}. \tag{3}$$
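
The rank inequality is easy to test numerically. In the sketch below, a random Hermitian matrix is perturbed by a Hermitian matrix of rank 2 and the sup-distance between the two empirical spectral distributions is compared with rank/k. The matrix size, the seed and the helper name `esd_sup_distance` are assumptions of this illustration.

```python
import numpy as np


def esd_sup_distance(a_eigs, b_eigs):
    """Sup-distance between two empirical distribution functions."""
    grid = np.concatenate([a_eigs, b_eigs])
    fa = np.searchsorted(np.sort(a_eigs), grid, side="right") / len(a_eigs)
    fb = np.searchsorted(np.sort(b_eigs), grid, side="right") / len(b_eigs)
    return float(np.max(np.abs(fa - fb)))


rng = np.random.default_rng(1)
k = 30

# Random Hermitian matrix A of order k.
G = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
A = (G + G.conj().T) / 2

# Hermitian perturbation of rank 2: u u* - v v*.
u = rng.standard_normal(k) + 1j * rng.standard_normal(k)
v = rng.standard_normal(k) + 1j * rng.standard_normal(k)
B = A + np.outer(u, u.conj()) - np.outer(v, v.conj())

rank_diff = int(np.linalg.matrix_rank(A - B))
dist = esd_sup_distance(np.linalg.eigvalsh(A), np.linalg.eigvalsh(B))
print(f"rank(A - B) = {rank_diff}, sup-distance = {dist:.4f}, bound = {rank_diff / k:.4f}")
```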

Let $\tau = (I, J)$ be a random transposition, where I, J are chosen
independently and uniformly from $\{1, \dots, n\}$. Multiplication by $\tau$
results in taking a step in the chain defined by $\Pi$. Now, for any
$\sigma \in S_n$, the k × k Hermitian matrices $A(\sigma)$ and
$A(\sigma\tau)$ differ at most in one column and one row, and hence
$\operatorname{rank}(A(\sigma) - A(\sigma\tau)) \le 2$. Thus, by (3),

$$|f(\sigma) - f(\sigma\tau)| \le \frac{2}{k}. \tag{4}$$

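Again, if I and J both fall outside $\{1, \dots, k\}$, then
$A(\sigma) = A(\sigma\tau)$. Combining this with (3) and (4), and noting that
the probability that I or J falls in $\{1, \dots, k\}$ is at most 2k/n, we get

$$|||f|||_\infty^2 = \frac{1}{2} \max_{\sigma \in S_n} \mathbb{E}\bigl(f(\sigma) - f(\sigma\tau)\bigr)^2 \le \frac{1}{2} \Bigl(\frac{2}{k}\Bigr)^2 \frac{2k}{n} = \frac{4}{kn}.$$

Therefore, from Theorems 2 and 3 it follows that for any $r \ge 0$,

$$\mathbb{P}(|F_A(x) - F(x)| \ge r) \le 6\exp\Bigl(-\frac{r\sqrt{2/n}}{2\sqrt{4/(kn)}}\Bigr) = 6\exp\bigl(-r\sqrt{k/8}\bigr). \tag{5}$$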

The above result is true for any x. Now, if
$F_A(x-) := \lim_{y \uparrow x} F_A(y)$, then by the bounded convergence
theorem we have $\mathbb{E}F_A(x-) = \lim_{y \uparrow x} F(y) = F(x-)$. It
follows that for every r,

$$\mathbb{P}(|F_A(x-) - \mathbb{E}F_A(x-)| > r) \le \liminf_{y \uparrow x} \mathbb{P}(|F_A(y) - F(y)| > r) \le 6e^{-r\sqrt{k/8}}.$$

Since this holds for all r, the $>$ can be replaced by $\ge$. Similarly it is
easy to show that F is a legitimate cumulative distribution function. Now fix
an integer $l \ge 2$, and for $1 \le i < l$ let

$$t_i := \inf\{x : F(x) \ge i/l\}.$$

Let $t_0 = -\infty$ and $t_l = \infty$. Note that for each i,
$F(t_{i+1}-) - F(t_i) \le 1/l$. Let

$$\Delta = \Bigl(\max_{1 \le i < l} |F_A(t_i) - F(t_i)|\Bigr) \vee \Bigl(\max_{1 \le i < l} |F_A(t_i-) - F(t_i-)|\Bigr).$$

Now take any $x \in \mathbb{R}$. Let i be an index such that
$t_i \le x < t_{i+1}$. Then

$$F_A(x) \le F_A(t_{i+1}-) \le F(t_{i+1}-) + \Delta \le F(x) + 1/l + \Delta.$$

Similarly,

$$F_A(x) \ge F_A(t_i) \ge F(t_i) - \Delta \ge F(x) - 1/l - \Delta.$$

Combining, we see that

$$\|F_A - F\|_\infty \le 1/l + \Delta.$$

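Thus, for any $r \ge 0$, a union bound over the $2(l - 1)$ terms in the
definition of $\Delta$ gives

$$\mathbb{P}(\|F_A - F\|_\infty \ge 1/l + r) \le 12(l - 1)e^{-r\sqrt{k/8}}.$$

Taking $l = [k^{1/2}] + 1$, we get for any $r \ge 0$,

$$\mathbb{P}(\|F_A - F\|_\infty \ge 1/\sqrt{k} + r) \le 12\sqrt{k}\,e^{-r\sqrt{k/8}}.$$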

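This proves the first claim of Theorem 1. To prove the second, using the
above inequality with $r = \sqrt{8}\log k/\sqrt{k}$, we get

$$\mathbb{E}\|F_A - F\|_\infty \le \frac{1 + \sqrt{8}\log k}{\sqrt{k}} + \mathbb{P}\Bigl(\|F_A - F\|_\infty \ge \frac{1 + \sqrt{8}\log k}{\sqrt{k}}\Bigr) \le \frac{13 + \sqrt{8}\log k}{\sqrt{k}}.$$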
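
The arithmetic behind the last step can be verified directly: with $r = \sqrt{8}\log k/\sqrt{k}$ the exponent $r\sqrt{k/8}$ equals $\log k$, so the tail bound collapses to $12/\sqrt{k}$. The short check below is a sketch; the function names are ours.

```python
from math import exp, log, sqrt


def tail_bound(k, r):
    """Right-hand side of the tail inequality in Theorem 1."""
    return 12 * sqrt(k) * exp(-r * sqrt(k / 8))


def expectation_bound(k):
    """Right-hand side of the expectation bound in Theorem 1."""
    return (13 + sqrt(8) * log(k)) / sqrt(k)


for k in (10, 1000, 10**6):
    r = sqrt(8) * log(k) / sqrt(k)
    threshold = 1 / sqrt(k) + r
    # The tail bound at this r is exactly 12 / sqrt(k) ...
    assert abs(tail_bound(k, r) - 12 / sqrt(k)) < 1e-9
    # ... and threshold + tail bound is the expectation bound.
    assert abs(threshold + tail_bound(k, r) - expectation_bound(k)) < 1e-9
    print(k, expectation_bound(k))
```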

For the case of singular values, we proceed as follows. As before, we let
$\pi$ be a random permutation of $\{1, \dots, n\}$; but here we define
$A(\pi) = M(\pi_1, \dots, \pi_k; 1, \dots, n)$. Since the singular values of A
are just the square roots of the eigenvalues of $AA^*$, we have

$$\|F_A - \mathbb{E}(F_A)\|_\infty = \|F_{AA^*} - \mathbb{E}(F_{AA^*})\|_\infty,$$

and so it suffices to prove a concentration inequality for $F_{AA^*}$. As
before, we fix x and define

$$f(\pi) = F_{AA^*}(x).$$
The crucial observation is that by Lemma 2.6 of Bai [1], we have that for
any two k × n matrices A and B,

$$\|F_{AA^*} - F_{BB^*}\|_\infty \le \frac{\operatorname{rank}(A - B)}{k}.$$

The rest of the proof proceeds exactly as before. □
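
The inequality used in the singular value case can also be sanity-checked numerically. In the sketch below, two k × n row-submatrices of a random non-symmetric matrix differ in exactly one selected row, so their difference has rank 1 and the empirical distributions of the eigenvalues of $AA^*$ and $BB^*$ should differ by at most 1/k in sup-norm. The sizes, the seed and the helper name `esd_sup_distance` are assumptions of this illustration.

```python
import numpy as np


def esd_sup_distance(a_vals, b_vals):
    """Sup-distance between two empirical distribution functions."""
    grid = np.concatenate([a_vals, b_vals])
    fa = np.searchsorted(np.sort(a_vals), grid, side="right") / len(a_vals)
    fb = np.searchsorted(np.sort(b_vals), grid, side="right") / len(b_vals)
    return float(np.max(np.abs(fa - fb)))


rng = np.random.default_rng(2)
n, k = 60, 15
M = rng.standard_normal((n, n))  # deliberately not symmetric

rows = rng.choice(n, size=k, replace=False)
used = set(rows.tolist())
rows_b = rows.copy()
rows_b[0] = next(i for i in range(n) if i not in used)  # replace one row index

A, B = M[rows, :], M[rows_b, :]
# Squared singular values of A are the eigenvalues of A A*.
ev_a = np.linalg.svd(A, compute_uv=False) ** 2
ev_b = np.linalg.svd(B, compute_uv=False) ** 2

rank_diff = int(np.linalg.matrix_rank(A - B))
dist = esd_sup_distance(ev_a, ev_b)
print(f"rank(A - B) = {rank_diff}, sup-distance = {dist:.4f}, bound = {rank_diff / k:.4f}")
```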